Abstract

Hairpin completion and its variant called bounded hairpin completion
are operations on formal languages, inspired by hairpin formation in
molecular biology. Another variant called hairpin lengthening has
recently been introduced, and its closure properties and algorithmic
problems have been studied for several families of languages.

In this paper, we introduce a new operation of this kind, called hairpin
incompletion, which is not only an extension of bounded hairpin completion
but also a restricted (bounded) variant of hairpin lengthening. Further,
the hairpin incompletion operation provides a formal language theoretic
framework that models a bio-molecular technique nowadays known as
Whiplash PCR. We study the closure properties of language families
under both the operation and its iterated version.

We show that a family of languages closed under intersection with regular
sets, concatenation with regular sets, and finite union is closed under
one-sided iterated hairpin incompletion, and that a family of languages
containing all linear languages and closed under circular permutation, left
derivative and substitution is also closed under iterated hairpin
incompletion.

1 Introduction 

In recent years, an operation called hairpin completion has been introduced
and intensively investigated in formal language theory, inspired by
intramolecular phenomena in molecular biology. A hairpin structure is well
known as one of the most popular secondary structures that a single-stranded
DNA (or RNA) molecule can form, with the help of so-called Watson-Crick
complementarity and annealing, under a certain biochemical condition in a solution.

This paper continues the line of research started in [5],
where the hairpin completion operation was introduced, followed by several
other related papers ([9j [TTJ HI]), where both the hairpin completion and its
inverse operation (the hairpin reduction) were investigated.

Inspired by threefold motivations, we will introduce the notion of hairpin 
incompletion in this paper. Firstly, the hairpin incompletion is a natural
extension of the notion of bounded hairpin completion introduced and studied in
[4], which is a restricted variant of the hairpin completion with the property
that the length of the prefix (suffix) prolongation is bounded by a constant. Thus,
the bounded hairpin completion involves the lengthening of prefix (suffix) with 
a constant length of the strand at the end, which implies that the resulting 
strand always bears a specific property that its prefix and suffix always form 
complementary sub-strands of a certain constant length. In contrast, our notion 
of hairpin incompletion can produce a resulting strand with more complexity, 
due to the nature of its prolongation, which will be formally explained later. 

Secondly, the hairpin incompletion can also be regarded as a restricted variant
of the notion of hairpin lengthening recently introduced in [10], which is an
extension of the (original) notion of the hairpin completion. More specifically,
the hairpin lengthening concerns a prolongation of a strand that is allowed to
stop at any position in the process of completing a hairpin structure. From the
viewpoint of practical molecular implementation, we are interested here
in the case where the prolongation in the hairpin lengthening is bounded by a
constant, which leads to our notion of the hairpin incompletion. In this respect,
one may take the hairpin incompletion as the bounded variant of the hairpin
lengthening.

Thirdly, the hairpin incompletion can provide a purely formal framework
that exactly models a bio-molecular technique called Whiplash PCR, which is
now recognized as a promising experimental technique and was
proposed in an ingenious paper [5] by Hagiya et al. They developed an
experimental technique called polymerization stop and theoretically showed, in
terms of thermal cycling, how DNA molecules can solve the learning problem of
$\mu$-formulas (i.e., Boolean formulas with each variable appearing only once) from
given data. Suppose that a DNA sequence is designed as given in (a) of Figure
1, where a sequence of transitions (program) is delimited by a special sequence
(called stopper sequence) and $\alpha$ and its reversed complement $\overline{\alpha}^R$ may
hybridize, leading to a hairpin structure (b). Then, the head $\overline{\alpha}^R$ (current state)
is extended by polymerization (with a primer $\overline{\alpha}^R$ and a template $\gamma$) up to
$\overline{\gamma}^R$, where the stopper sequence is specifically designed to act as the stopper. In
this way, this cycle can execute one process of state transition and be
repeatedly performed (footnote 1). Following the work of [3], Sakamoto et al. have shown how
some NP-complete problems can be solved with Whiplash PCR (or Whiplash
machines) ([II]). Recently, Komiya et al. have demonstrated the applicability
of Whiplash PCR to the experimental validation of signal dependent operation
(0).

1 Adleman has named this experimental technique whiplash PCR.

Figure 1: (a) The structural design of a Whiplash PCR molecule; (b) hairpin
formation with stem part $\alpha$; (c) polymerization extension of $\gamma$; (d) simulation
of one state transition.

The paper is organized as follows. After providing the definitions of the basic
concepts used in the paper, we define the central notion of hairpin
incompletion (as an extension of the bounded hairpin completion and also as a bounded
variant of the hairpin lengthening) in Section 2. In Section 3, we first show
that any family of languages with certain closure properties is closed under
the hairpin incompletion. We then consider the iterated
hairpin incompletion operations, and show that every AFL is closed under the
iterated one-sided hairpin incompletion. This result is further extended to the
general case of the iterated hairpin incompletion: any family
of languages that includes all linear languages and has certain closure
properties is also closed under the iterated hairpin incompletion and, as a corollary,
the family of context-free languages is closed under the iterated hairpin
incompletion. A brief discussion with concluding remarks follows in Section 4.



2 Preliminaries 

2.1 Basic definitions 

This paper assumes that the reader is familiar with the basic notions of
formal language theory [TS]. In particular, for the notions of abstract family of
languages (AFL), we refer to [IB].

For an alphabet $V$, $V^*$ is the set of all finite-length strings of symbols from
$V$, and $\lambda$ is the empty string, while $V^+$ denotes $V^* - \{\lambda\}$. For $w \in V^*$, $|w|$ is
the length of $w$. For $k \ge 0$, we define $V^{\ge k} = \{w \in V^* \mid |w| \ge k\}$. Note that for
a set $S$, $|S|$ denotes the cardinality of $S$.

For $k \ge 0$, let $pref_k(w)$ and $suf_k(w)$ be the prefix and the suffix of a word $w$
of length $k$, respectively. For $k \ge 0$, we define $Pref_{\le k}(w) = \{pref_i(w) \mid 0 \le i \le k\}$
and $Suf_{\le k}(w) = \{suf_i(w) \mid 0 \le i \le k\}$. For $k \ge 1$, let $Inf_k(w)$ be the set of
infixes of $w$ of length $k$. If $|w| \le k - 1$, then $pref_k(w)$, $suf_k(w)$ and $Inf_k(w)$ are
all undefined. (Note that for $w \in V^+$, $pref_k(w)$ and $suf_k(w)$ are elements in
$Inf_k(w)$.) By $wL$ ($Lw$) we simply denote $\{w\}L$ ($L\{w\}$), i.e., the concatenation
of $w$ with a language $L$. The left derivative of a language $L$ with a word $w$ is
defined by $w \backslash L = \{x \in V^* \mid wx \in L\}$. For a word $w = a_1 a_2 \cdots a_n \in V^*$, $w^R$ is
the reversal (mirror image) of $w$, that is, $(a_1 a_2 \cdots a_n)^R = a_n \cdots a_2 a_1$.
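
As an illustration only, the notions above can be written as a few Python helpers. The function names (pref, suf, pref_le, suf_le, inf, left_derivative) are our own, returning None for an undefined value is an assumption of this sketch, and languages are taken to be finite sets of strings.

```python
def pref(w, k):
    """pref_k(w): the prefix of w of length k, or None when |w| < k (undefined)."""
    return w[:k] if len(w) >= k else None


def suf(w, k):
    """suf_k(w): the suffix of w of length k, or None when |w| < k (undefined)."""
    return w[len(w) - k:] if len(w) >= k else None


def pref_le(w, k):
    """Pref_{<=k}(w): all defined prefixes pref_i(w) with 0 <= i <= k."""
    return {w[:i] for i in range(min(k, len(w)) + 1)}


def suf_le(w, k):
    """Suf_{<=k}(w): all defined suffixes suf_i(w) with 0 <= i <= k."""
    return {w[len(w) - i:] for i in range(min(k, len(w)) + 1)}


def inf(w, k):
    """Inf_k(w): the set of infixes of w of length k (empty here when |w| < k)."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}


def left_derivative(w, language):
    """w \\ L = {x | wx in L}, for a finite language given as a set of strings."""
    return {x[len(w):] for x in language if x.startswith(w)}


def reversal(w):
    """w^R: the reversal of w."""
    return w[::-1]
```

For instance, pref("abcde", 2) gives "ab", suf("abcde", 2) gives "de", and inf("abab", 2) gives the two infixes "ab" and "ba".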

A morphism $h : V^* \to U^*$ such that $h(a) \in U$ for all $a \in V$ is called a coding,
and it is called a weak coding if $h(a) \in U \cup \{\lambda\}$ for all $a \in V$.

An involution over $V$ is a bijection $\sigma : V \to V$ such that $\sigma = \sigma^{-1}$. In
particular, an involution $\sigma$ over $V$ such that $\sigma(a) \ne a$ for all $a \in V$ is called a
Watson-Crick involution (in molecular computing theory), in a metaphorical
sense of DNA complementarity.

In this paper, we fix an involution $\overline{\,\cdot\,}$ over $V$ such that $\overline{\overline{a}} = a$ for $a \in V$ and
extend it to $V^*$ in the usual way. Note that for all $x \in V^*$, it holds that
$(\overline{x})^R = \overline{x^R}$.
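
A minimal sketch of such an involution, assuming the concrete DNA alphabet {A, C, G, T} with the usual Watson-Crick pairing (the paper itself only fixes an abstract involution, so this choice and the names bar and rev are illustrative assumptions):

```python
# Assumed concrete involution: the DNA Watson-Crick complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bar(x):
    """Apply the involution letter by letter (its extension to words)."""
    return x.translate(COMPLEMENT)


def rev(x):
    """Reversal x^R."""
    return x[::-1]


x = "AACGT"
# The involution is its own inverse and has no fixed letter,
# and it commutes with reversal: (bar x)^R = bar(x^R).
involutive = bar(bar(x)) == x
no_fixed_letter = all(bar(a) != a for a in "ACGT")
commutes = rev(bar(x)) == bar(rev(x))
```

Because the two operations commute, the reverse complement of a word can be computed in either order, which is what the notation with an overline and a superscript R relies on.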

2.2 Hairpin incompletion— A bounded variant of hairpin 
lengthening 

For the original definitions of the (unbounded) $k$-hairpin completion, the reader
is referred to the preceding papers (for example, [IJOIE]). A variant of the notion
called bounded $k$-hairpin completion and its modified operation were introduced
and investigated in [4] and [8], respectively, while a recent paper [10] introduces
and studies an extended version of the hairpin completion, called hairpin
lengthening.

In this paper, we are interested in a new variant of both the bounded
$k$-hairpin completion and the hairpin lengthening, which is introduced as
follows.

Let $m, k \ge 1$. For any $w \in V^*$, we define the $m$-bounded $k$-hairpin
incompletion of $w$, denoted by $HI_{m,k}(w)$, as follows:

  $rHI_{m,k}(w) = \{ w\overline{\gamma}^R \mid w = \delta\gamma\alpha\beta\overline{\alpha}^R,\ |\alpha| = k,\ |\gamma| \le m,\ \alpha, \beta, \gamma, \delta \in V^* \}$,
  $lHI_{m,k}(w) = \{ \overline{\gamma}^R w \mid w = \alpha\beta\overline{\alpha}^R\gamma\delta,\ |\alpha| = k,\ |\gamma| \le m,\ \alpha, \beta, \gamma, \delta \in V^* \}$,
  $HI_{m,k}(w) = rHI_{m,k}(w) \cup lHI_{m,k}(w)$,

where $rHI_{m,k}$ (or $lHI_{m,k}$) is called $m$-bounded right (or left) $k$-hairpin
incompletion. Moreover, $m$-bounded right (or left) $k$-hairpin incompletion is also
called $m$-bounded one-sided $k$-hairpin incompletion. (See Figure 2 for a pictorial
illustration of the operations $rHI_{m,k}$ and $lHI_{m,k}$.) Thus, from a mathematical
viewpoint, we consider the hairpin incompletion operations whose prolongations
take place at both ends in a hypothetical (and ideal) molecular biological setting.

Note. For $w \in V^*$ not satisfying the condition to apply the $m$-bounded
$k$-hairpin incompletion, here we assume $rHI_{m,k}(w) = lHI_{m,k}(w) = \{w\}$.
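
The definition can be tried out on small words. The following is a sketch under stated assumptions, not a reference implementation: the involution is taken to be the DNA complement, the function names rhi, lhi and hi are ours, and the left operation is computed from the right one through the reverse-complement symmetry of the two definitions.

```python
# Assumed involution: DNA Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bar_rev(x):
    """Reverse complement of x (overline followed by reversal)."""
    return x.translate(COMPLEMENT)[::-1]


def rhi(w, m, k):
    """m-bounded right k-hairpin incompletion of the word w."""
    n = len(w)
    results = set()
    if n >= 2 * k:
        # The last k letters of w play the role of bar(alpha)^R.
        alpha = bar_rev(w[n - k:])
        # alpha must occur entirely before that suffix.
        for p in range(n - 2 * k + 1):
            if w[p:p + k] == alpha:
                # gamma is the block of length g <= m just left of alpha.
                for g in range(min(m, p) + 1):
                    results.add(w + bar_rev(w[p - g:p]))
    # Convention of the Note: words that do not fit the pattern are kept.
    return results or {w}


def lhi(w, m, k):
    """Left variant, obtained from rhi by the reverse-complement symmetry."""
    return {bar_rev(u) for u in rhi(bar_rev(w), m, k)}


def hi(w, m, k):
    """m-bounded k-hairpin incompletion: union of both one-sided variants."""
    return rhi(w, m, k) | lhi(w, m, k)
```

With w = "TGAACCCGT", m = 2 and k = 2, the suffix "GT" pairs with the infix "AC", and the block "GA" to its left may be copied fully, partly, or not at all, so rhi(w, 2, 2) consists of w, w followed by "T", and w followed by "TC".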

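The iterated version of the $m$-bounded right $k$-hairpin incompletion is
defined in the usual manner:

  $rHI^0_{m,k}(w) = \{w\}$,
  $rHI^{n+1}_{m,k}(w) = rHI_{m,k}(rHI^n_{m,k}(w))$ for $n \ge 0$,
  $rHI^*_{m,k}(w) = \bigcup_{n \ge 0} rHI^n_{m,k}(w)$.

The "left" counterpart of the iterated version of this operation is defined in an
obvious and similar manner and is denoted by $lHI^*_{m,k}(w)$.

Further, the iterated version of the $m$-bounded $k$-hairpin incompletion
operation is defined in a similar manner as follows:

  $HI^0_{m,k}(w) = \{w\}$,
  $HI^{n+1}_{m,k}(w) = HI_{m,k}(HI^n_{m,k}(w))$ for $n \ge 0$,
  $HI^*_{m,k}(w) = \bigcup_{n \ge 0} HI^n_{m,k}(w)$.

Finally, the iterated version of the $m$-bounded (right or left) $k$-hairpin
incompletion operation is naturally extended to languages as follows:

  $rHI^*_{m,k}(L) = \bigcup_{w \in L} rHI^*_{m,k}(w)$, and similarly for $lHI^*_{m,k}(L)$ and $HI^*_{m,k}(L)$.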
Note that the bounded hairpin incompletion in this paper is an extension of
bounded hairpin completion in the sense that $HI_{m,k}(w)$ is exactly the same as
$mHC_k(w)$ in [3] when the prefix (suffix) $\delta$ of $w$ is empty. Further, the hairpin
lengthening $HL_k(w)$ in [10] corresponds to the union of all $HI_{m,k}(w)$ in this paper,
taken over all $m \ge 1$.
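
The iteration can be explored by a simple breadth-first closure. This is only a sketch: the DNA complement, the helper names, and the round limit max_rounds are assumptions of the example, and the result equals the n-th iterate only because every word belongs to its own image under the one-step operation as implemented here.

```python
# Assumed involution: DNA Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bar_rev(x):
    """Reverse complement of x."""
    return x.translate(COMPLEMENT)[::-1]


def rhi(w, m, k):
    """One step of m-bounded right k-hairpin incompletion (as defined above)."""
    n = len(w)
    results = set()
    if n >= 2 * k:
        alpha = bar_rev(w[n - k:])
        for p in range(n - 2 * k + 1):
            if w[p:p + k] == alpha:
                for g in range(min(m, p) + 1):
                    results.add(w + bar_rev(w[p - g:p]))
    return results or {w}


def iterate(op, w, max_rounds):
    """Words reachable from w in at most max_rounds applications of op.

    Stops early when a round produces nothing new (a fixed point).
    """
    reached = {w}
    frontier = {w}
    for _ in range(max_rounds):
        new = set()
        for u in frontier:
            new |= op(u) - reached
        if not new:
            break
        reached |= new
        frontier = new
    return reached


def step(u):
    """The one-step operation with m = 2 and k = 2 fixed for the example."""
    return rhi(u, 2, 2)
```

For w = "TGAACCCGT" the closure stabilises after two rounds with four words, so the iterated language of this particular word is finite; in general it may be infinite, which is why the sketch takes a round limit.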

3 Main Results 

3.1 Non-iterated bounded hairpin incompletion 

As is expected from the definitions, non-iterated bounded hairpin incompletion 
operation behaves as the bounded hairpin completion operation does. 

Theorem 1. Let $\mathcal{L}$ be a class of languages and $m, k \ge 1$. If $\mathcal{L}$ is closed under
gsm-mappings, then $\mathcal{L}$ is also closed under $m$-bounded $k$-hairpin incompletion.

Proof. For any $m, k \ge 1$, consider a generalized sequential machine (gsm) $g_{m,k}$
which adds a suffix (or prefix) $\overline{\gamma}^R$ of length at most $m$ to the input word $w$ if $w$
is of the form $\delta\gamma\alpha\beta\overline{\alpha}^R$ (or $\alpha\beta\overline{\alpha}^R\gamma\delta$) with $|\alpha| = k$, $|\gamma| \le m$. It is easily shown
that this gsm simulates $m$-bounded $k$-hairpin incompletion $HI_{m,k}(w)$. □




Figure 2: (a) $m$-bounded right $k$-hairpin incompletion operation; (b) $m$-bounded
left $k$-hairpin incompletion operation, where $|\alpha| = k$ and $|\gamma| \le m$.

Since every trio (see footnote 2) is closed under gsm mappings ([18]), the following is
straightforwardly obtained.

Corollary 1. Every trio is closed under $m$-bounded $k$-hairpin incompletion for
any $m, k \ge 1$.

This result extends the corresponding one (Proposition 1) in [4], while it is
in contrast to the results (Propositions 1 and 2) in [10].

3.2 Iterated bounded one-sided hairpin incompletion 

In this section, we consider the closure property of iterated bounded one-sided
hairpin incompletion. In particular, we show that every AFL is closed under this
operation. To this end, we first prepare some notions required in the proof
of the main result. A key idea of the proof is to construct a certain equivalence
relation which is right invariant and of finite index.

First, we consider the iterated $m$-bounded right $k$-hairpin incompletion
operation $rHI^*_{m,k}$.

Definition 1. Given $m, k \ge 1$ and a word $w \in V^{\ge 2k}$, we define:

  $C_{m,k}(w) = \{ (xy, z) \mid xy \in \bigcup_{0 \le i \le m} Inf_{i+k}(w),\ |y| = k,\ w = w_1 xy w_2,\ z \in Suf_{\le k}(w_2) \cap Pref_{\le k}(\overline{y}^R) \}$,
  $D_{m,k}(w) = ( C_{m,k}(w),\ \bigcup_{0 \le i \le m} \{ suf_{i+k-1}(w) \} )$.

We also define a binary relation $\equiv_{D_{m,k}}$ as follows: for $w_1, w_2 \in V^{\ge 2k}$,

  $w_1 \equiv_{D_{m,k}} w_2$ iff $D_{m,k}(w_1) = D_{m,k}(w_2)$.

2 A non-empty family of languages closed under $\lambda$-free morphisms, inverse morphisms and
intersection with regular languages.
Intuitively, a pair $(xy, z)$ in $C_{m,k}(w)$ is a candidate for $(\gamma\alpha, \overline{\alpha}^R)$,
where $\alpha$ and $\gamma$ satisfy the conditions to apply $m$-bounded right $k$-hairpin
incompletion to $w$, producing a word in $rHI_{m,k}(w)$.

From the definition, it holds that $(\gamma\alpha, \overline{\alpha}^R)$ is in $C_{m,k}(w)$ with $|\alpha| = k$ if and
only if $w\overline{\gamma}^R$ is in $rHI_{m,k}(w)$.
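
To make the role of the candidate pairs concrete, the set can be computed directly from Definition 1. This is a hedged sketch: the DNA complement is an assumed involution and the names c_set and extensions_from_c are ours.

```python
# Assumed involution: DNA Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bar_rev(x):
    """Reverse complement of x."""
    return x.translate(COMPLEMENT)[::-1]


def c_set(w, m, k):
    """C_{m,k}(w): pairs (xy, z) read off Definition 1."""
    pairs = set()
    for i in range(m + 1):                    # i = |x|, with 0 <= i <= m
        length = i + k
        for start in range(len(w) - length + 1):
            xy = w[start:start + length]      # an infix of length i + k
            w2 = w[start + length:]           # what follows this occurrence
            target = bar_rev(xy[i:])          # reverse complement of y
            for j in range(k + 1):
                z = target[:j]                # a prefix of length <= k
                if w2.endswith(z):            # ... that is also a suffix of w2
                    pairs.add((xy, z))
    return pairs


def extensions_from_c(w, m, k):
    """Words w + bar(gamma)^R obtained from the pairs whose z has full length k."""
    return {w + bar_rev(xy[:len(xy) - k])
            for (xy, z) in c_set(w, m, k) if len(z) == k}
```

On w = "TGAACCCGT" with m = k = 2, the pairs with a full-length second component are ("AC", "GT"), ("AAC", "GT") and ("GAAC", "GT"), and they yield the extensions by the empty word, "T" and "TC" respectively.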

The binary relation $\equiv_{D_{m,k}}$ is clearly an equivalence relation and of finite
index, that is, the number of equivalence classes $|V^{\ge 2k} / \equiv_{D_{m,k}}|$ is finite.
Moreover, the following claim holds.

Claim 1. The equivalence relation $\equiv_{D_{m,k}}$ is right invariant, that is, for $w_1, w_2 \in V^{\ge 2k}$,
$w_1 \equiv_{D_{m,k}} w_2$ implies that for any $r \in V^*$, $w_1 r \equiv_{D_{m,k}} w_2 r$.

Proof. We prove it by induction on the length of $r$. If $|r| = 0$, then the claim
trivially holds. Assume that $w_1 \equiv_{D_{m,k}} w_2$ implies $w_1 r \equiv_{D_{m,k}} w_2 r$ for a word
$r \in V^*$. Then, it suffices to show that for any $a \in V$, $D_{m,k}(w_1 r a) = D_{m,k}(w_2 r a)$.
We observe that $D_{m,k}(w_1 r a)$ is constructed from only $D_{m,k}(w_1 r)$ as follows:

  $\bigcup_{0 \le i \le m} \{ suf_{i+k-1}(w_1 r a) \} = \{ suf_{i+k-2}(w_1 r) \cdot a \mid 0 \le i \le m,\ i + k \ge 2,\ |w_1 r| \ge i + k - 2 \}$
      $(\cup\, \{\lambda\}$ if $k = 1)$,

  $C_{m,k}(w_1 r a) = \{ (x, \lambda) \mid (x, \lambda) \in C_{m,k}(w_1 r) \}$
      $\cup\ \{ (suf_{i+k-1}(w_1 r) \cdot a, \lambda) \mid 0 \le i \le m,\ |w_1 r| \ge i + k - 1 \}$
      $\cup\ \{ (xy, za) \mid (xy, z) \in C_{m,k}(w_1 r),\ |y| = k,\ za \in Pref_{\le k}(\overline{y}^R) \}$.

Note that if $(xy, z) \in C_{m,k}(w_1 r)$, then $w_1 r = w' xy w'' z$ for some $w', w'' \in V^*$,
so that $w_1 r a$ can be rewritten as $w' xy w'' z a$. Therefore, $\{ (xy, za) \mid (xy, z) \in
C_{m,k}(w_1 r),\ |y| = k,\ za \in Pref_{\le k}(\overline{y}^R) \}$ is contained in $C_{m,k}(w_1 r a)$.

From the induction hypothesis, since $D_{m,k}(w_1 r) = D_{m,k}(w_2 r)$, we can
construct $D_{m,k}(w_2 r a)$ from only $D_{m,k}(w_1 r)$ in the same way. Thus, it holds that
$D_{m,k}(w_1 r a) = D_{m,k}(w_2 r a)$. □
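
Claim 1 can be sanity-checked by brute force on short words. The sketch below is an empirical check under assumptions (DNA complement as the involution, hypothetical helper names d_sig and check_right_invariance), not a substitute for the proof: it groups all words up to a given length by their value of D and verifies that appending one letter never splits a group.

```python
from itertools import product

# Assumed involution: DNA Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bar_rev(x):
    """Reverse complement of x."""
    return x.translate(COMPLEMENT)[::-1]


def c_set(w, m, k):
    """C_{m,k}(w) as in Definition 1."""
    pairs = set()
    for i in range(m + 1):
        length = i + k
        for start in range(len(w) - length + 1):
            xy = w[start:start + length]
            w2 = w[start + length:]
            target = bar_rev(xy[i:])
            for j in range(k + 1):
                if w2.endswith(target[:j]):
                    pairs.add((xy, target[:j]))
    return pairs


def d_sig(w, m, k):
    """D_{m,k}(w): C_{m,k}(w) plus the defined suffixes of length k-1 .. m+k-1."""
    sufs = frozenset(w[len(w) - (i + k - 1):]
                     for i in range(m + 1) if len(w) >= i + k - 1)
    return (frozenset(c_set(w, m, k)), sufs)


def check_right_invariance(alphabet, m, k, max_len):
    """True if equal D-values stay equal after appending any single letter."""
    classes = {}
    for n in range(2 * k, max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            classes.setdefault(d_sig(w, m, k), []).append(w)
    for words in classes.values():
        for a in alphabet:
            if len({d_sig(w + a, m, k) for w in words}) != 1:
                return False
    return True
```

Checking single-letter extensions suffices, since invariance under arbitrary right factors follows by repeating the step, exactly as in the induction above.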

We first show that the language obtained by applying iterated bounded right 
hairpin incompletion to a singleton is regular. 



[Regular grammar $G_w$]

Let us consider the equivalence classes:

  $V^{\ge 2k} / \equiv_{D_{m,k}} = \{ [w_1], [w_2], \ldots, [w_t] \mid w_i \in V^{\ge 2k},\ 1 \le i \le t \}$,

where $w_i$ is the representative of $[w_i]$. For $w \in V^{\ge 2k}$, the regular grammar
$G_w = (N, V, P, S)$ is constructed as follows:

  $N = \{S\} \cup \{ D_i \mid 1 \le i \le t \}$,

  $P = \{ S \to wD_i \mid w \in V^{\ge 2k},\ w \equiv_{D_{m,k}} w_i \}$
      $\cup\ \{ D_i \to rD_j \mid (\gamma\alpha, \overline{\alpha}^R) \in C_{m,k}(w_i),\ |\alpha| = k,\ r = \overline{\gamma}^R,\ w_i r \equiv_{D_{m,k}} w_j,\ 1 \le i, j \le t \}$
      $\cup\ \{ D_i \to \lambda \mid 1 \le i \le t \}$.

We need the following two claims. 

Claim 2. Let $w$ be in $V^{\ge 2k}$, and $D_i, D_j \in N$. Then, for $n \ge 0$, if a derivation of
$G_w$ is of the form $wD_i \Rightarrow^n wrD_j$ for some $r \in V^*$, then it holds that $wr \equiv_{D_{m,k}} w_j$.

Proof. The proof is by induction on $n$. If $n = 0$, then $i = j$ and from the manner
of constructing $P$, it holds that $w \equiv_{D_{m,k}} w_j$; thus, the claim holds. Assume that the
claim holds for $n \ge 0$ and consider a derivation of the form $wD_i \Rightarrow^n wrD_j \Rightarrow
wrr'D_h$ for some $D_h \in N$, $r' \in V^*$. From the assumption and the form of $P$,
it holds that $wr \equiv_{D_{m,k}} w_j$ and $w_j r' \equiv_{D_{m,k}} w_h$. By Claim 1, we obtain that
$wrr' \equiv_{D_{m,k}} w_j r' \equiv_{D_{m,k}} w_h$. □

Claim 3. For $n \ge 0$ and $r \in V^*$, there exists a derivation of $G_w$ of the form
$S \Rightarrow wD_i \Rightarrow^n wrD_j \Rightarrow wr$ if and only if $wr$ is in $rHI^n_{m,k}(w)$.

Proof. The proof is by induction on $n$. If $n = 0$, it obviously holds that
$S \Rightarrow wD_i \Rightarrow w$ if and only if $w$ is in $rHI^0_{m,k}(w)$. Assume that the claim
holds for $n$ and consider the case for $n + 1$.

(If Part) Let $wr' \in rHI^{n+1}_{m,k}(w)$. Then there exist $r, \gamma \in V^*$ such that
$wr' = wr\overline{\gamma}^R \in rHI_{m,k}(wr)$ with $wr \in rHI^n_{m,k}(w)$. From the definition of $C_{m,k}$,
$(\gamma\alpha, \overline{\alpha}^R)$ is in $C_{m,k}(wr)$ with $|\alpha| = k$. From the induction hypothesis and Claim 2,
there exists a derivation $S \Rightarrow wD_i \Rightarrow^n wrD_j$ with $wr \equiv_{D_{m,k}} w_j$. Since
$(\gamma\alpha, \overline{\alpha}^R)$ is in $C_{m,k}(wr) = C_{m,k}(w_j)$, there exists the derivation $S \Rightarrow wD_i \Rightarrow^n
wrD_j \Rightarrow wr\overline{\gamma}^R D_h \Rightarrow wr\overline{\gamma}^R = wr'$ for some $D_h \in N$.

(Only If Part) If there exists the derivation $S \Rightarrow wD_i \Rightarrow^n wrD_j \Rightarrow wr\overline{\gamma}^R D_h
\Rightarrow wr\overline{\gamma}^R$ for some $D_h \in N$, it holds that $wr \equiv_{D_{m,k}} w_j$ from Claim 2. Moreover,
from the form of $P$, there exists $(\gamma\alpha, \overline{\alpha}^R) \in C_{m,k}(w_j) = C_{m,k}(wr)$. Hence,
$wr\overline{\gamma}^R$ is in $rHI_{m,k}(wr)$. From the induction hypothesis, $wr \in rHI^n_{m,k}(w)$, so
that $wr\overline{\gamma}^R \in rHI^{n+1}_{m,k}(w)$. □
It follows from the claim that the language obtained by applying iterated
bounded right hairpin incompletion to a singleton is regular.

Lemma 1. For any word $w \in V^*$ and $m, k \ge 1$, the language $rHI^*_{m,k}(w)$ is
regular.

Proof. In the case of $w \in V^* - V^{\ge 2k}$, from the definition, $rHI^*_{m,k}(w) = \{w\}$ is
regular. For $w \in V^{\ge 2k}$, it follows from Claim 3 that there exists a derivation of
$G_w$ which derives a terminal string $w'$ if and only if $w' \in rHI^*_{m,k}(w)$. Thus, we
have that $L(G_w) = rHI^*_{m,k}(w)$, which is regular. □

In order to show more general results, we need to prove the following claims regarding
the language $rHI^*_{m,k}(w)$.

Claim 4. For $w_1, w_2 \in V^{\ge 2k}$ and $n \ge 0$, if $w_1 \equiv_{D_{m,k}} w_2$ then there exists a
finite language $F \subseteq V^*$ such that $rHI^n_{m,k}(w_1) = w_1 F$ and $rHI^n_{m,k}(w_2) = w_2 F$.



Proof. The proof is by induction on $n$. If $n = 0$, it obviously holds that
$rHI^0_{m,k}(w_1) = w_1 F$ and $rHI^0_{m,k}(w_2) = w_2 F$, where $F = \{\lambda\}$. We assume that
the claim holds up to $n$. Let $rHI^n_{m,k}(w_1) = w_1 F$ and $rHI^n_{m,k}(w_2) = w_2 F$
for some finite language $F$. For any $r \in F$, it holds that $w_1 r \equiv_{D_{m,k}} w_2 r$ from
Claim 1. Hence, from the induction hypothesis, there exists a finite language
$F_r$ such that

  $rHI_{m,k}(w_1 r) = w_1 r F_r$ and $rHI_{m,k}(w_2 r) = w_2 r F_r$.

Therefore, it holds that

  $rHI^{n+1}_{m,k}(w_1) = rHI_{m,k}(w_1 F) = \bigcup_{r \in F} w_1 r F_r = w_1 \bigcup_{r \in F} r F_r = w_1 F'$,
  $rHI^{n+1}_{m,k}(w_2) = rHI_{m,k}(w_2 F) = \bigcup_{r \in F} w_2 r F_r = w_2 \bigcup_{r \in F} r F_r = w_2 F'$,

where $F' = \bigcup_{r \in F} r F_r$. □

Claim 5. For $w_1, w_2 \in V^{\ge 2k}$, if $w_1 \equiv_{D_{m,k}} w_2$ then there exists a regular
language $R \subseteq V^*$ such that $rHI^*_{m,k}(w_1) = w_1 R$ and $rHI^*_{m,k}(w_2) = w_2 R$.

Proof. From Claim 4, if $w_1 \equiv_{D_{m,k}} w_2$, then there exists a sequence of finite
languages $F_0, F_1, F_2, \ldots$, where $F_n \subseteq V^*$ ($n \ge 0$), with the property that for
$n \ge 0$, $rHI^n_{m,k}(w_1) = w_1 F_n$ and $rHI^n_{m,k}(w_2) = w_2 F_n$. Then it holds that

  $rHI^*_{m,k}(w_1) = \bigcup_{n \ge 0} rHI^n_{m,k}(w_1) = \bigcup_{n \ge 0} w_1 F_n = w_1 \bigcup_{n \ge 0} F_n$,
  $rHI^*_{m,k}(w_2) = \bigcup_{n \ge 0} rHI^n_{m,k}(w_2) = \bigcup_{n \ge 0} w_2 F_n = w_2 \bigcup_{n \ge 0} F_n$.

Let $R = \bigcup_{n \ge 0} F_n$. Then, we obtain $rHI^*_{m,k}(w_1) = w_1 R$ and $rHI^*_{m,k}(w_2) = w_2 R$.
Recall that $w_1 R$ and $w_2 R$ are regular from Lemma 1. The class of regular
languages is closed under left derivative, so that $R$ is also regular. □
We are now in a position to show the main theorem in this section. It is
shown that iterated bounded one-sided hairpin incompletion can be simulated
by several basic language operations, which leads to the following theorem.

Theorem 2. Let $\mathcal{L}$ be a class of languages and $m, k \ge 1$. If $\mathcal{L}$ is closed under
intersection with regular languages, concatenation with regular languages and
finite union, then $\mathcal{L}$ is also closed under iterated $m$-bounded right (left) $k$-hairpin
incompletion.

Proof. Let $L \in \mathcal{L}$ be a language over $V$. We can write $L = L_1 \cup L_2$, where

  $L_1 = L \cap V^{2k} \cdot V^* = \{ w \in L \mid |w| \ge 2k \}$,
  $L_2 = L \cap \bigcup_{0 \le n \le 2k-1} V^n = \{ w \in L \mid |w| < 2k \}$.

Note that $rHI^*_{m,k}(L) = rHI^*_{m,k}(L_1) \cup rHI^*_{m,k}(L_2) = rHI^*_{m,k}(L_1) \cup L_2$. Since
the number of the elements in $L_1 / \equiv_{D_{m,k}}$ is finite from the definition of $\equiv_{D_{m,k}}$,
we can set $L_1 / \equiv_{D_{m,k}} = \{ [w_1], [w_2], \ldots, [w_s] \mid w_i \in L_1$ for $1 \le i \le s \}$ for some
$s \ge 0$. From the way of construction of $D_{m,k}(w_i)$, it holds that for $1 \le i \le s$,

  $[w_i] = L_1 \cap \bigl( \bigcap_{(xy,z) \in C_{m,k}(w_i)} V^* xy V^* z \bigr) \cap \bigl( \bigcap_{0 \le j \le m} V^* \cdot suf_{j+k-1}(w_i) \bigr)$.

For $1 \le i \le s$, since all words in $[w_i]$ are equivalent, it follows from Claim 5 that
there exists a regular language $R_i$ such that $rHI^*_{m,k}([w_i]) = [w_i] R_i$. Moreover, it
holds that $rHI^*_{m,k}(L_1) = \bigcup_{1 \le i \le s} [w_i] R_i$. Thus, $rHI^*_{m,k}(L)$ can be constructed
from $L$ by intersection with regular languages, concatenation with regular
languages and finite union, which completes the proof. □

As a corollary, we immediately obtain the following. 

Corollary 2. Every AFL is closed under iterated $m$-bounded right (left)
$k$-hairpin incompletion for any $m, k \ge 1$.

It is known from [14] that there exists no universal regular grammar $G_u(x) =
(V, S, P, x)$ with the property that for any regular grammar $G$, there exists a
coding $w_G$ of $G$ such that $L(G) = L(G_u(w_G))$. This can be strengthened in the form
that no morphism $h$ can help to satisfy the equation $L(G) = h(L(G_u(w_G)))$.

In this context, the next lemma shows that the bounded hairpin incompletion
operation can play the role of a universal-like grammar for all regular languages.

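Lemma 2. A language $L \subseteq V^*$ is regular if and only if there exist a word
$w \in (V')^*$ and a weak coding $h : (V')^* \to V^*$ such that $L = h(rHI^*_{1,1}(w) \cap
(V' - \{\#\})^* V'')$, where $\# \in V'$ and $V'' \subseteq V'$.

Proof. (If Part) This clearly holds, because the class of the regular languages
is closed under iterated bounded right hairpin incompletion, intersection and
weak codings.

(Only If Part) For a regular grammar $G = (N, V, P, S)$, we construct $V'$, $V''$,
$w \in (V')^*$ and $h : (V')^* \to V^*$ as follows:

• $V' = \{ [a, X] \mid a \in V,\ X \in N \cup \{\lambda\} \} \cup \{ [\overline{a}, \overline{X}] \mid a \in V,\ X \in N \} \cup \{ \#, \overline{\#} \}$,

• $V'' = \{ [a, \lambda] \mid a \in V \}$,

• $w = \bigl( \prod_{X_i \to aX_j \in P,\ b \in V} \overline{\#}\, [\overline{a}, \overline{X_j}]\, [\overline{b}, \overline{X_i}] \bigr) [\lambda, S]$,

• $h(A) = a$ for $A = [a, X] \in \{ [a, X] \mid a \in V,\ X \in N \cup \{\lambda\} \}$, $h(A) = \lambda$
otherwise.

Note that for any $n \ge 0$ and $w' = \delta\gamma\alpha\beta\overline{\alpha}^R \in rHI^n_{1,1}(w) \cap (V' - \{\#\})^*$ with
$|\alpha| = |\gamma| = 1$, if $w'\overline{\gamma}^R \in rHI^{n+1}_{1,1}(w) \cap (V' - \{\#\})^*$, then $\gamma$ is the symbol just
right of $\overline{\#}$. Then, from the way of construction of $w$, it holds that there exists
a derivation of $G$,

  $S \Rightarrow a_1 X_1 \Rightarrow a_1 a_2 X_2 \Rightarrow \cdots \Rightarrow a_1 a_2 \cdots a_{n-1} X_{n-1} \Rightarrow a_1 a_2 \cdots a_{n-1} a_n$,

if and only if

  $w' = \bigl( \prod_{X_i \to aX_j \in P,\ b \in V} \overline{\#}\, [\overline{a}, \overline{X_j}]\, [\overline{b}, \overline{X_i}] \bigr) [\lambda, S] [a_1, X_1] [a_2, X_2] \cdots [a_{n-1}, X_{n-1}] [a_n, \lambda]$

is in $rHI^*_{1,1}(w) \cap (V' - \{\#\})^*$, which can be shown by induction on $n$. By
applying $h$, we obtain $L(G) = h(rHI^*_{1,1}(w) \cap (V' - \{\#\})^* V'')$. □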

We note that Theorem 3 in [10] proves the only if part of this lemma for the
iterated (unbounded) hairpin lengthening. Thus, Lemma 2 complements the
result for the case of bounded hairpin lengthening.

3.3 Iterated bounded hairpin incompletion 

In this section, we consider the closure property of iterated bounded hairpin
incompletion. For the (unbounded) hairpin lengthening operation, Theorem 4 of
[10] proves that the family of context-free languages is closed under iterated
hairpin lengthening. We will show that the result also holds for
the case of iterated bounded hairpin lengthening, in a more general setting of
AFL-like formulation.

The proof is based on an idea similar to that of the previous section, and Claims 1
to 3 correspond to Claims 6 to 8 (below), respectively.

In order to consider both-sided hairpin incompletion, we modify the equiv- 
alence relation. 

Definition 2. For $m, k \ge 1$ and a word $w \in V^{\ge 2k}$, $C'_{m,k}(w)$, $D'_{m,k}(w)$ and
$E_{m,k}(w)$ are defined by

  $C'_{m,k}(w) = \{ (z, yx) \mid yx \in \bigcup_{0 \le i \le m} Inf_{i+k}(w),\ |y| = k,\ w = w_1 yx w_2,\ z \in Pref_{\le k}(w_1) \cap Suf_{\le k}(\overline{y}^R) \}$,
  $D'_{m,k}(w) = ( C'_{m,k}(w),\ \bigcup_{0 \le i \le m} \{ pref_{i+k-1}(w) \} )$,
  $E_{m,k}(w) = \langle D_{m,k}(w), D'_{m,k}(w) \rangle$,

where $D_{m,k}(w)$ is as defined in Definition 1.

The binary relation $\equiv_{E_{m,k}}$ is defined by $w_1 \equiv_{E_{m,k}} w_2$ iff $E_{m,k}(w_1) = E_{m,k}(w_2)$
for $w_1, w_2 \in V^{\ge 2k}$.

The binary relation $\equiv_{E_{m,k}}$ is clearly an equivalence relation and of finite
index. Note that $D_{m,k}$ and $D'_{m,k}$ are symmetrically defined.

We show that the equivalence relation $\equiv_{E_{m,k}}$ is right invariant and left
invariant.

Claim 6. The equivalence relation $\equiv_{E_{m,k}}$ is right invariant and left invariant,
that is, for $w_1, w_2 \in V^{\ge 2k}$, if $w_1 \equiv_{E_{m,k}} w_2$ then for any $r, l \in V^*$, $w_1 r \equiv_{E_{m,k}} w_2 r$
and $l w_1 \equiv_{E_{m,k}} l w_2$ hold.

Proof. We first show that for $r \in V^*$, $w_1 r \equiv_{E_{m,k}} w_2 r$. The proof is by
induction on the length of $r$. If $|r| = 0$, it clearly holds. We assume that the
claim holds for $n$, i.e., $w_1 r \equiv_{E_{m,k}} w_2 r$ with $|r| = n$. Let $a$ be a symbol in $V$.

[Proof of $D_{m,k}(w_1 r a) = D_{m,k}(w_2 r a)$] It can be shown in the same way as
Claim 1.

[Proof of $D'_{m,k}(w_1 r a) = D'_{m,k}(w_2 r a)$] We construct $D'_{m,k}(w_1 r a)$ from only
$E_{m,k}(w_1 r)$ as follows:

  $\bigcup_{0 \le i \le m} \{ pref_{i+k-1}(w_1 r a) \} = \{ pref_{i+k-1}(w_1 r) \mid 0 \le i \le m,\ |w_1 r| \ge i + k - 1 \}$
      $(\cup\, \{ w_1 r a \}$ if $|w_1 r| < m + k - 1)$,

  $C'_{m,k}(w_1 r a) = C'_{m,k}(w_1 r)$
      $\cup\ \{ (\lambda, suf_{i+k-1}(w_1 r) \cdot a) \mid 0 \le i \le m,\ |w_1 r| \ge i + k - 1 \}$
      $\cup\ \{ (z, suf_{i+k-1}(w_1 r) \cdot a) \mid 0 \le i \le m,\ |w_1 r| \ge |z| + i + k - 1,\ z \in Pref_{\le k}(w_1 r) \cap Suf_{\le k}(\overline{suf_{i+k-1}(w_1 r) \cdot a}^R) \}$.

Note that for $0 \le i \le m$, $z \in Pref_{\le k}(w_1 r) \cap Suf_{\le k}(\overline{suf_{i+k-1}(w_1 r) \cdot a}^R)$ with
$|w_1 r| \ge |z| + i + k - 1$ and some $z' \in V^*$, $w_1 r a$ can be represented as $w_1 r a =
z \cdot z' \cdot suf_{i+k-1}(w_1 r) \cdot a$. Hence, $(z, suf_{i+k-1}(w_1 r) \cdot a)$ is in $C'_{m,k}(w_1 r a)$.

Since $E_{m,k}(w_1 r) = E_{m,k}(w_2 r)$, we can construct $D'_{m,k}(w_2 r a)$ from only
$E_{m,k}(w_1 r)$ in the same way. Therefore, it holds that $D'_{m,k}(w_1 r a) = D'_{m,k}(w_2 r a)$.
From $D_{m,k}(w_1 r a) = D_{m,k}(w_2 r a)$ and $D'_{m,k}(w_1 r a) = D'_{m,k}(w_2 r a)$, we eventually
get $w_1 r a \equiv_{E_{m,k}} w_2 r a$.

The left invariance of $\equiv_{E_{m,k}}$ can be shown in the symmetrical manner. □

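[Linear grammar $G_L$]

For the proof of Theorem 3 (below) regarding $m$-bounded $k$-hairpin incompletion,
we need to construct a linear grammar. For $L \subseteq V^*$, let $L / \equiv_{E_{m,k}} =
\{ A_1, A_2, \ldots, A_u \}$ for some $u \ge 1$ and $V^{\ge 2k} / \equiv_{E_{m,k}} = \{ [w_1], [w_2], \ldots, [w_s] \}$ for
some $s \ge 1$, where $w_i$ is the representative of $[w_i]$. A linear grammar $G_L =
(N, T, P, S)$ is constructed as follows:

  $N = \{S\} \cup \{ E_i \mid 0 \le i \le s \}$,

  $T = V \cup \{ a_i \mid 0 \le i \le u \} \cup \{ \$ \}$,

  $P = \{ S \to E_i a_j \mid$ for any $w \in A_j$, $w \equiv_{E_{m,k}} w_i \}$
      $\cup\ \{ E_i \to rE_j \mid (\gamma\alpha, \overline{\alpha}^R) \in C_{m,k}(w_i),\ |\alpha| = k,\ r = \overline{\gamma}^R,\ w_i r \equiv_{E_{m,k}} w_j \}$
      $\cup\ \{ E_i \to E_j l \mid (\overline{\alpha}^R, \alpha\gamma) \in C'_{m,k}(w_i),\ |\alpha| = k,\ l = \overline{\gamma}^R,\ l w_i \equiv_{E_{m,k}} w_j \}$
      $\cup\ \{ E_i \to \$ \mid 0 \le i \le s \}$.

We set $R_P = \{ r \mid E_i \to rE_j \in P \} \cup \{\lambda\}$ and $L_P = \{ l \mid E_i \to E_j l \in P \} \cup \{\lambda\}$.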

Claim 7. Let $0 \le p \le u$ and $E_i, E_j \in N$. For $n \ge 0$, if a derivation of $G_L$ is
of the form $E_i a_p \Rightarrow^n r_1 \cdots r_n E_j l_n \cdots l_1 a_p$, then for any $w \in A_p$, it holds that
$l_n \cdots l_1 w r_1 \cdots r_n \equiv_{E_{m,k}} w_j$, where for each $1 \le h \le n$, $r_h \in R_P$, $l_h \in L_P$, one
of $r_h$ and $l_h$ is $\lambda$ and the other is not $\lambda$.

Proof. The proof is by induction on $n$. If $n = 0$, then $i = j$ and from the manner
of constructing $P$, for any $w \in A_p$, it holds that $w \equiv_{E_{m,k}} w_j$; thus the claim
holds. Assume that the claim holds for $n \ge 0$ and consider a derivation of the
form

  $E_i a_p \Rightarrow r' E_j a_p \Rightarrow^n r' r_1 \cdots r_n E_h l_n \cdots l_1 a_p$
  ( $E_i a_p \Rightarrow E_j l' a_p \Rightarrow^n r_1 \cdots r_n E_h l_n \cdots l_1 l' a_p$ )

for some $E_h \in N$, $r' \in R_P$ ($l' \in L_P$). From the assumption and the form
of $P$, for any $w \in A_p$, it holds that $w r' \equiv_{E_{m,k}} w_j$ ($l' w \equiv_{E_{m,k}} w_j$) and
$l_n \cdots l_1 w_j r_1 \cdots r_n \equiv_{E_{m,k}} w_h$. By Claim 6, we obtain that

  $l_n \cdots l_1 w r' r_1 \cdots r_n \equiv_{E_{m,k}} l_n \cdots l_1 w_j r_1 \cdots r_n \equiv_{E_{m,k}} w_h$
  ( $l_n \cdots l_1 l' w r_1 \cdots r_n \equiv_{E_{m,k}} l_n \cdots l_1 w_j r_1 \cdots r_n \equiv_{E_{m,k}} w_h$ ).

□

Claim 8. A word $r_1 \cdots r_n \$ l_n \cdots l_1 a_i$ is generated by $G_L$ if and only if for any
$w \in A_i$, $l_n \cdots l_1 w r_1 \cdots r_n$ is in $HI^n_{m,k}(L)$, where for each $1 \le h \le n$, $r_h \in R_P$,
$l_h \in L_P$, one of $r_h$ and $l_h$ is $\lambda$ and the other is not $\lambda$.

Proof. The proof is by induction on $n$. If $n = 0$, it obviously holds that
$S \Rightarrow E_i a_j \Rightarrow \$ a_j$ if and only if for any $w \in A_j$, $w$ is in $HI^0_{m,k}(L)$. Assume
that the claim holds for $n$ and consider the case for $n + 1$.

(If Part) Let $l_{n+1} l_n \cdots l_1 w r_1 \cdots r_n r_{n+1} \in HI^{n+1}_{m,k}(w)$, where for each $1 \le h \le
n + 1$, $r_h \in R_P$, $l_h \in L_P$, one of $r_h$ and $l_h$ is $\lambda$ and the other is not $\lambda$. From the
definition of $C_{m,k}$ and $C'_{m,k}$, either $(\overline{r_{n+1}}^R \cdot \alpha, \overline{\alpha}^R)$ is in $C_{m,k}(l_n \cdots l_1 w r_1 \cdots r_n)$
or $(\overline{\alpha}^R, \alpha \cdot \overline{l_{n+1}}^R)$ is in $C'_{m,k}(l_n \cdots l_1 w r_1 \cdots r_n)$ with $|\alpha| = k$. From the induction
hypothesis and Claim 7, there exists a derivation

  $S \Rightarrow E_i a_p \Rightarrow^n r_1 \cdots r_n E_j l_n \cdots l_1 a_p$

with $l_n \cdots l_1 w r_1 \cdots r_n \equiv_{E_{m,k}} w_j$. Therefore, it holds that either $(\overline{r_{n+1}}^R \cdot \alpha, \overline{\alpha}^R) \in
C_{m,k}(w_j)$ or $(\overline{\alpha}^R, \alpha \cdot \overline{l_{n+1}}^R) \in C'_{m,k}(w_j)$, from which there exists the derivation
either

  $S \Rightarrow E_i a_p \Rightarrow^n r_1 \cdots r_n E_j l_n \cdots l_1 a_p \Rightarrow r_1 \cdots r_n r_{n+1} E_h l_n \cdots l_1 a_p
     \Rightarrow r_1 \cdots r_n r_{n+1} \$ l_n \cdots l_1 a_p$

or

  $S \Rightarrow E_i a_p \Rightarrow^n r_1 \cdots r_n E_j l_n \cdots l_1 a_p \Rightarrow r_1 \cdots r_n E_h l_{n+1} l_n \cdots l_1 a_p
     \Rightarrow r_1 \cdots r_n \$ l_{n+1} l_n \cdots l_1 a_p$

for some $E_h \in N$.

(Only If Part) Consider the case where there exists a derivation $S \Rightarrow E_i a_p \Rightarrow^n
r_1 \cdots r_n E_j l_n \cdots l_1 a_p \Rightarrow r_1 \cdots r_n r_{n+1} E_h l_n \cdots l_1 a_p \Rightarrow r_1 \cdots r_n r_{n+1} \$ l_n \cdots l_1 a_p$ for
some $E_h \in N$. Then, it holds that for any $w \in A_p$, $l_n \cdots l_1 w r_1 \cdots r_n \equiv_{E_{m,k}} w_j$
from Claim 7. Moreover, from the way of construction of $P$, there exists $(\overline{r_{n+1}}^R \cdot
\alpha, \overline{\alpha}^R) \in C_{m,k}(w_j) = C_{m,k}(l_n \cdots l_1 w r_1 \cdots r_n)$. Hence, $l_n \cdots l_1 w r_1 \cdots r_n r_{n+1}$ is
in $HI_{m,k}(l_n \cdots l_1 w r_1 \cdots r_n)$. From the induction hypothesis, $l_n \cdots l_1 w r_1 \cdots r_n \in
HI^n_{m,k}(w)$, so that $l_n \cdots l_1 w r_1 \cdots r_n r_{n+1} \in HI^{n+1}_{m,k}(w)$.

For the other case, there exists a derivation $S \Rightarrow E_i a_p \Rightarrow^n r_1 \cdots r_n E_j l_n \cdots l_1 a_p
\Rightarrow r_1 \cdots r_n E_h l_{n+1} l_n \cdots l_1 a_p \Rightarrow r_1 \cdots r_n \$ l_{n+1} l_n \cdots l_1 a_p$ for some $E_h \in N$. Then
we can show in a similar way that for any $w \in A_p$, $l_{n+1} l_n \cdots l_1 w r_1 \cdots r_n \in
HI^{n+1}_{m,k}(w)$. □
In order to prove the next result, we need a language operation called circular
permutation $cp$, which maps every word to the set of all its circular permutations
and every language to the set of all circular permutations of its words. The proof
is due to an idea similar to the one in [3].
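
As a small illustration of how circular permutation is used together with left derivative and substitution, consider the following sketch. The sample word, the marker letter "a" standing for a symbol $a_i$, and the finite class passed to substitute are hypothetical choices made for the example only.

```python
def cp(w):
    """All circular permutations of the word w."""
    return {w[i:] + w[:i] for i in range(len(w))} or {w}


def cp_lang(language):
    """Circular permutation extended to a (finite) language."""
    result = set()
    for w in language:
        result |= cp(w)
    return result


def left_derivative(w, language):
    """w \\ L = {x | wx in L}."""
    return {x[len(w):] for x in language if x.startswith(w)}


def substitute(words, marker, replacement_class):
    """Replace the single marker letter by every word of a finite class."""
    return {u.replace(marker, v) for u in words for v in replacement_class}


# A word of the shape r $ l a_i, with r = "GT", l = "CA" and marker "a".
generated = {"GT$CAa"}
# Rotating and then cutting off the leading $ moves r behind the marker.
rotated = left_derivative("$", cp_lang(generated))
# Substituting the marker by the words of a class yields words of shape l w r.
result = substitute(rotated, "a", {"TT", "GG"})
```

Since the symbol $ occurs exactly once in each generated word, only one rotation starts with it, so the left derivative by $ selects exactly the rearranged word.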

Theorem 3. Let $\mathcal{L}$ be a class of languages which includes all linear languages
and let $m, k \ge 1$. If $\mathcal{L}$ is closed under circular permutation, left derivative and
substitution, then $\mathcal{L}$ is also closed under iterated $m$-bounded $k$-hairpin
incompletion.

Proof. Recall the construction of the linear grammar $G_L$. Let $L$ be in $\mathcal{L}$ and
$f$ be a substitution over $T$ defined by $f(a_i) = A_i$ for $\{ a_i \mid 0 \le i \le u \}$ and
$f(a) = \{a\}$ otherwise. From Claim 8, it holds that

  $L_G = \{ r_1 \cdots r_n \$ l_n \cdots l_1 a_i \in T^* \mid 1 \le j \le n,\ r_j \in R_P,\ l_j \in L_P,$
        for any $w \in A_i$, $l_n \cdots l_1 w r_1 \cdots r_n \in HI^*_{m,k}(L) \}$,

where $L_G = L(G_L)$. Hence, it is easily seen that $HI^*_{m,k}(L) = f(\$ \backslash cp(L_G))$. □

Since the family of context-free languages meets all of the preconditions in
Theorem 3, the following corollary holds.

Corollary 3. The family of context-free languages is closed under iterated
$m$-bounded $k$-hairpin incompletion for any $m, k \ge 1$.

4 Concluding Remarks 

In many works on DNA-based computing and related areas, DNA hairpin
structures have found numerous applications in developing novel computing
mechanisms in molecular computing. Among others, molecules with hairpin formation,
as used in Whiplash PCR, have been successfully employed as the basic feature of
new computational models to solve an instance of the 3-SAT problem ([12]), to
execute (and simulate) state transition systems ([16]), to explore the feasibility
of parallel computing for solving DHPP ([6]), and so forth. On the other hand,
different types of hairpin and hairpin-free languages are defined in [13] and more
recently in [5], where they are studied from a language theoretical point of view.

We have proposed a new variant of hairpin completion called hairpin
incompletion, and investigated the closure properties of language families under it. The
hairpin incompletion is in fact a bounded variant of the hairpin lengthening in
[10], where not only closure properties of language families but also the
algorithmic aspects of the hairpin lengthening operations are investigated. The hairpin
incompletion is also an extended version of the bounded hairpin completion
recently studied in [4], which has more recently been followed up by slightly modified
operations in [8], where two open problems from [4] have been solved.

We have shown that every AFL is closed under the iterated one-sided hairpin
incompletion, and therefore, the family of regular languages is closed under the
operation. Further, it has been shown that the family of context-free languages
is closed under the iterated hairpin incompletion. These complement some of
the corresponding results for (unbounded) hairpin lengthening operations in
[10]. Moreover, since the hairpin incompletion nicely models a bio-molecular
technique (Whiplash PCR), the results obtained in this paper may provide new
insight into the computational analysis of the experimental technique.

It remains an interesting open problem whether the family of regular languages
is closed under iterated hairpin incompletion.